Abstract. Graphs and graph transformation systems are a frequently
used modelling technique for a wide range of different domains, covering
areas as diverse as refactorings, network topologies or reconfigurable
software. Being a formal method, graph transformation systems lend 
themselves to a formal analysis. This has inspired the development of 
various verification methods, in particular also model checking tools. 

In this paper, we present a verification technique for infinite-state graph 
transformation systems. The technique employs the abstraction principle 
used in shape analysis of programs, summarising possibly infinitely many
nodes and thus yielding shape graphs. The technique has been implemented
using the 3-valued logical foundations of standard shape analysis. We 
exemplify the approach on an example from the railway domain. 



1 Introduction 

Graph transformation systems (GTSs, [CMR+97]) have, in particular due to
their visual appeal, become a widely used technique for system modelling. They
are employed in numerous different areas, ranging from the specification of visual
contracts for software to dynamically evolving systems. They serve as a formally
precise description of the behaviour of complex systems. Often, such systems
operate in safety-critical domains (e.g. railway, automotive) and their
dependability is of vital interest. Hence, a number of approaches for the analysis
of graph transformation systems have been developed [Tae03], in particular also
model checking techniques [RSV04], [SV03], [Ren03], [BCK08], [BBKR08], [SWJ08].
Model checking allows properties of system models, for instance properties
specified in temporal logic, to be shown fully automatically. Model checking proceeds
by exploring the whole state space of a model, i.e. in the case of graph
transformation systems by generating the set of graphs which are reachable from a given
start graph by means of rule application. While existing tools have proven
able to tackle even large state spaces, standard model checking techniques fail
when the state space becomes infinite.



There are, in general, two approaches to dealing with very large or even infinite 
state spaces. The first approach is to devise a clever way of selecting a finite 
subset of states which is sufficient for proving the desired properties, effectively 
constructing an under-approximation of the system. This concept is explored
e.g. in bounded model checking [BCC+03]. The second approach is to construct
an abstraction, i.e. a finite representation of a superset of the state space. This
over-approximation of the system is then used to show certain properties of the
original system. 

In this paper, we propose a new approach towards a verification technique
for infinite-state graph transformation systems using over-approximation. The
technique follows the idea of shape analysis algorithms for programs [SRW02],
which are used to compute properties of a program's heap structures. Shape
analyses compute abstractions of heap states by collapsing certain sets of
identical nodes into so-called summary nodes. Thereby, an infinite number of heap
states can be finitely represented. Such shapes can be used to derive structural
properties about the heap.

This principle of summarisation in GTSs has already been presented in
numerous other works, for example by Rensink et al. [Ren04], [RD06] or Bauer et al.
[BBKR08], which introduce so-called abstract graph transformations. Our
approach is set apart from these by strict adherence to the formalism presented
in [SRW02], which immensely simplifies implementation and gives us a level
of parametrisation that other approaches lack. A more thorough discussion of the
advantages of our approach over related work is presented in Section 6.

Here, we present a shape analysis for GTSs which is directly based on the 
3-valued logical foundations of standard shape analysis. Given this logical basis 
for shape graphs, we define rule application on shape graphs via a constructive 
definition of materialisation and summarisation. The technique can thus be di- 
rectly implemented as defined, even re-using parts of the logical machinery of 
TVLA [BLARS07] , the most prominent shape analysis tool. In order to illustrate 
our technique, we exemplify it on a simple GTS model from the railway domain. 

The paper is structured as follows. The next section gives the basic
definitions for our approach. Section 3 introduces materialisation and summarisation
on graphs and thereby defines the application of rules on shape graphs.
Section 4 shows the correctness of our approach, i.e. shows that rule application
on shape graphs computes an over-approximation of the set of reachable graphs.
Section 5 then reports on the implementation. Finally, Section 6
concludes, further discusses related work and gives some directions for future
research.

2 Background 

This section introduces the basic definitions that are required to formulate our 
main results. To illustrate the definitions in this section, we use the following 



example from the rail domain. A rail network is given by a set of stations (S) and
a set of rail sections, called tracks (T), connected by a relation called "next".
Vehicles, called "railcabs" (RC), possibly with passengers (P), travel on the tracks.
The example is a simplified version of a case study coming from the project
"Neue Bahntechnik Paderborn"¹. Figure 1 shows a graph depicting one configuration
of such a rail network. Configurations can change in a number of ways, for
instance by passengers entering railcabs and railcabs moving on tracks according
to predefined protocols. The overall goal is to show certain safety properties (e.g.
collision avoidance) for arbitrary networks. We start by defining some
basic notions on graphs.

Fig. 1: A simple rail network

Definition 1. A graph $G$ is a pair $(N, E)$, where $N$ is a set of nodes and
$E \subseteq N \times \mathcal{L} \times N$ is a set of labelled edges for some label set $\mathcal{L}$. For any graph
$G$, $N_G$ and $E_G$ denote its node and edge sets, respectively.

This definition restricts the class of graphs we are considering to those in which 
no more than a single same-labelled edge may exist between any two nodes. The 
generic concept of a morphism extends to these graphs in a natural way. 

Definition 2. For graphs $G$ and $H$, a morphism $f : G \to H$ is a function
$f : N_G \to N_H$ extended to edges by $f(n, l, n') = (f(n), l, f(n'))$ such that
$f(E_G) \subseteq E_H$.
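Definitions 1 and 2 translate almost literally into code. The following is a minimal Python sketch, not the implementation described later in the paper; the names `Graph` and `is_morphism` and the two small example graphs are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, Hashable, Tuple

Node = Hashable
Edge = Tuple[Node, str, Node]  # (source, label, target)


class Graph:
    """A graph (N, E) whose edges are labelled triples, as in Definition 1."""

    def __init__(self, nodes, edges):
        self.nodes: FrozenSet[Node] = frozenset(nodes)
        self.edges: FrozenSet[Edge] = frozenset(edges)
        # Every edge must connect nodes of the graph.
        assert all(s in self.nodes and t in self.nodes for s, _, t in self.edges)


def is_morphism(f: Dict[Node, Node], g: Graph, h: Graph) -> bool:
    """Definition 2: f maps N_G into N_H and the image of E_G lies in E_H."""
    if set(f) != set(g.nodes) or not set(f.values()) <= set(h.nodes):
        return False
    return all((f[s], l, f[t]) in h.edges for s, l, t in g.edges)


# Type loops such as (r, "RC", r) encode the node type.
pattern = Graph({"r", "t"}, {("r", "RC", "r"), ("t", "T", "t"), ("r", "on", "t")})
network = Graph(
    {"r1", "t1", "t2"},
    {("r1", "RC", "r1"), ("t1", "T", "t1"), ("t2", "T", "t2"),
     ("r1", "on", "t1"), ("t1", "next", "t2")},
)
ok = is_morphism({"r": "r1", "t": "t1"}, pattern, network)
bad = is_morphism({"r": "r1", "t": "t2"}, pattern, network)
```

Because edges are stored as a set of triples, the restriction to at most one same-labelled edge between any two nodes holds by construction.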

Figure 1 shows a graph representing one very simple rail network consisting of
two stations which are connected by two tracks. Note that we include a simple 
notion of typing in the graph. The type of a node is represented by a loop labelled 
with the name of the type. Such type loops are not displayed as edges but rather 
as part of the node name. Thus, instead of displaying a self-edge of n labelled 
"RC", we label the node n : RC. 



¹ http://nbp-www.upb.de

Fig. 2: Rule EnterStation



In order to model the dynamic behaviour of a system represented by a graph, 
we need to transform graphs into other graphs. For this, graph production rules 
can be used. In this paper, we take an operational, not categorical, view on 
graph transformation. As a consequence, we favour a simple approach to graph 
production rules, as the following definitions show. 



Definition 3. A graph production rule $P = (L, R)$ consists of two graphs $L$ and
$R$ called the left hand side and the right hand side, respectively.



Figure 2 shows an example of a rule which describes a railcab entering a station.
In addition, we have rules for leaving a station, for movement of single railcabs as well as
convoys of railcabs, and for forming convoys (all elided due to space restrictions).
In the rules, we use node names instead of injective morphisms to identify nodes
appearing in the left as well as the right hand side. Aside from this technicality we
use the standard SPO approach to rule definition and application [Low93]. In
order to make node creation and deletion explicit, we use the following sets:

\[ N^- = N_L \setminus N_R, \quad E^- = E_L \setminus E_R \quad \text{(deleted nodes and edges)} \]

\[ N^+ = N_R \setminus N_L, \quad E^+ = E_R \setminus E_L \quad \text{(created nodes and edges)} \]



These sets are used to define the effect of an application of a production rule on 
a graph G. 

Definition 4. Let $P = (L, R)$ be a production rule and $G$ a graph. The rule $P$ can
be applied on $G$ if we can find an injective morphism $m : L \to G$ (called a
matching).

If $m$ is a matching, then the application of $P$ onto $G$ with matching $m$ is the
graph $H = (N_H, E_H)$ with

\[ N_H = (N_G \setminus m(N^-)) \cup N^+ \]

\[ E_H = ((E_G \setminus m(E^-)) \cup \hat{m}(E^+)) \cap (N_H \times \mathcal{L} \times N_H) \]

where $\hat{m} = m \cup id_{N^+}$.
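The effect of Definition 4 can be sketched in a few lines of Python. This is an illustrative sketch only: graphs are plain pairs of sets, created nodes are assumed to carry names that do not yet occur in $G$, and the example rule is a hypothetical stand-in loosely modelled on EnterStation (the actual rule of Fig. 2 is not reproduced here).

```python
def apply_rule(L, R, G, m):
    """Apply production (L, R) to graph G under an injective matching m.

    Graphs are pairs (nodes, edges); edges are (source, label, target) triples.
    """
    (NL, EL), (NR, ER), (NG, EG) = L, R, G
    n_del, e_del = NL - NR, EL - ER      # N^-, E^-
    n_new, e_new = NR - NL, ER - EL      # N^+, E^+
    assert not (n_new & NG), "created nodes are assumed to have fresh names"
    m_hat = dict(m)
    m_hat.update({n: n for n in n_new})  # m extended by the identity on N^+

    def image(edges):
        return {(m_hat[s], l, m_hat[t]) for s, l, t in edges}

    NH = (NG - {m[n] for n in n_del}) | n_new
    EH = (EG - image(e_del)) | image(e_new)
    # Intersect with N_H x L x N_H: edges touching deleted nodes disappear.
    EH = {(s, l, t) for s, l, t in EH if s in NH and t in NH}
    return NH, EH


# Hypothetical rule: a railcab r on track t moves to the adjacent station s.
types = {("r", "RC", "r"), ("t", "T", "t"), ("s", "S", "s")}
L = ({"r", "t", "s"}, types | {("r", "on", "t"), ("t", "next", "s")})
R = ({"r", "t", "s"}, types | {("r", "on", "s"), ("t", "next", "s")})
G = ({"r1", "t1", "s1"},
     {("r1", "RC", "r1"), ("t1", "T", "t1"), ("s1", "S", "s1"),
      ("r1", "on", "t1"), ("t1", "next", "s1")})
NH, EH = apply_rule(L, R, G, {"r": "r1", "t": "t1", "s": "s1"})
```

The final filtering step corresponds to the intersection with $N_H \times \mathcal{L} \times N_H$ and is what makes the approach single-pushout style: dangling edges are removed rather than blocking the rule.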



For this production application we write $G \xrightarrow{P,m} H$. Similarly, $G \xrightarrow{P} H$ holds
if there is some $m$ such that $G \xrightarrow{P,m} H$, and $G \to H$ if there is furthermore
a production rule $P$ such that $G \xrightarrow{P} H$. We let $\to^*$ denote the transitive and
reflexive closure of $\to$. With these definitions at hand, we can define the set
of reachable graphs of a graph transformation system (or graph grammar, as it
includes a start graph).

Definition 5. A graph transformation system $GT = (G_0, (P_i)_{i \in I})$ consists of
a start graph $G_0$ and a set of production rules $P_i$, $i \in I$. The set of reachable
graphs of a graph transformation system $GT$ is

\[ reach(GT) = \{ G \mid G_0 \to^* G \} \]

In this paper we are interested in proving properties of the set of all reachable 
graphs. A property can for instance be the absence of forbidden patterns, i.e. 
substructures, in a graph (or the presence of desired patterns). 

Such a forbidden pattern can be defined by a production rule of the form
$P = (F, F)$ (with left and right hand side equal). The pattern is present in a
graph $G$ ($G \models F$) if the rule matches. A forbidden pattern for our example
is given on the right hand side. It specifies a collision of two railcabs
(two railcabs on one track).

The set of reachable graphs can in general be infinite (e.g. for our example, if 
we introduce a rule which allows new passengers to be created). The objective 
of this paper is to construct an abstraction (and overapproximation) of this set 
of reachable graphs which is finite but on which we can still show properties. 

Before doing so, we need to look a bit closer into the basic technology
behind shape analysis. Shape analysis algorithms operate on logical structures,
using first order logic to formulate properties. In the following, we closely follow
[SRW02] in our notation. Note, however, that we explicitly exclude the notion
of transitive closure from [SRW02], since transitivity would violate the
important locality property of rule applications. The word formula always refers to a
first order formula over a set of predicate symbols $\mathcal{P}$ and variables $\mathcal{V}$. Variables
are assigned values from some domain (or universe) $U$, and $k$-ary predicates $\mathcal{P}_k$
are interpreted by truth-valued functions, i.e. we have an interpretation function
$\iota : \mathcal{P}_k \to (U^k \to T)$ ($T$ a set of truth values). We let $\mathcal{F}(\varphi)$ denote the set of
free variables of a formula $\varphi$. Domain, predicates and interpretation function
together make up a logical structure $S = (U, \mathcal{P}, \iota)$, sometimes also abbreviated
by $(U, \iota)$. For a formula $\varphi$, $[\![\varphi]\!]^S_m$ denotes the value of $\varphi$ in the structure $S$ under
an assignment $m$.

A logical structure is called $n$-valued if $|T| = n$ holds for the target set $T$ of
the predicates. Two sets of truth values will play a role here: the ordinary
boolean values ($T = \{0, 1\}$) and the three-valued set of Kleene logic
($T = \{0, 1, \frac{1}{2}\}$, values called false, true and maybe). On the truth values of Kleene
logic we have two different orderings (see Fig. 3), one reflecting the amount of
information ($\sqsubseteq$) present in a logical value, the other the logical truth ($\leq$). That
is, for $l_1, l_2 \in T = \{0, 1, \frac{1}{2}\}$:

\[ l_1 \sqsubseteq l_2 \iff (l_1 = l_2) \lor (l_2 = \tfrac{1}{2}) \]

\[ l_1 \leq l_2 \iff (l_1 = l_2) \lor (l_1 = 0) \lor (l_1 = \tfrac{1}{2} \land l_2 = 1) \]

Fig. 3: Kleene logic with logical and information order
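The two orderings are easy to state executably. The sketch below encodes the three truth values as exact fractions; the connectives `k_and`, `k_or` and `k_not` are the usual Kleene connectives (minimum, maximum and $1 - x$), which the text above does not spell out, and all function names are illustrative.

```python
from fractions import Fraction

FALSE, MAYBE, TRUE = Fraction(0), Fraction(1, 2), Fraction(1)


def k_and(a, b):
    """Kleene conjunction: the minimum in the logical order."""
    return min(a, b)


def k_or(a, b):
    """Kleene disjunction: the maximum in the logical order."""
    return max(a, b)


def k_not(a):
    """Kleene negation: swaps 0 and 1 and leaves 1/2 fixed."""
    return 1 - a


def info_leq(a, b):
    """Information order: a is below b iff a = b or b = 1/2."""
    return a == b or b == MAYBE


def logic_leq(a, b):
    """Logical order 0 <= 1/2 <= 1, written exactly as in the definition."""
    return a == b or a == FALSE or (a == MAYBE and b == TRUE)
```

Note that the information order has $\frac{1}{2}$ as its top element: the definite values 0 and 1 are incomparable with each other, and both are more precise than "maybe".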

Our final goal is to represent graphs as well as their abstractions by logical
structures, the former by 2-valued and the latter by 3-valued ones. To this end, we
partition the set $\mathcal{P}$ into the sets of so-called core predicates $\mathcal{C}$ and instrumentation
predicates $\mathcal{I}$. Later on, $\mathcal{C}$ will encode basic properties (like next-relations between
nodes), while $\mathcal{I}$ will be used to increase the precision of the analysis with respect
to a given property. The set $\mathcal{C}$ is further subdivided into unary core predicates $\mathcal{C}^1$
(used e.g. for types) and binary core predicates $\mathcal{C}^2$. One specific predicate called
summarised ($sm$) is used in the abstraction: a summarised node can represent
many concrete nodes. In ordinary graphs no nodes are summarised.

The encoding of graphs as logical structures then works as follows: the set of
nodes $N$ of a graph will be represented by the domain set $U$. The edge labels $\mathcal{L}$
will give us the set of predicate symbols $\mathcal{P}$, and particular edges are encoded by
$\iota$. Table 1 gives the logical structure of the rail network of Fig. 1.

Definition 6. Let $G$ be a graph. The 2-valued encoding of $G$, denoted $ls(G)$,
is a 2-valued logical structure $S = (U, \mathcal{P}, \iota)$ with $U = N_G$, $\mathcal{C}^1 \cup \mathcal{C}^2 = \mathcal{P} = \mathcal{L}$ and
$\iota$ defined by:

- For $p \in \mathcal{C}^2$: $\iota(p)(u_1, u_2) = 1 \iff (u_1, p, u_2) \in E_G$,

- For $p \in \mathcal{C}^1$: $\iota(p)(u) = 1 \iff (u, p, u) \in E_G$,

- For $sm$: $\iota(sm)(u) = 0$.
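As an illustration of Definition 6, the following Python sketch builds the interpretation function as nested dictionaries. The function name `ls` mirrors the notation above; the dictionary layout and the concrete edges of the example network are assumptions, since only the node and predicate sets of Table 1 are available here.

```python
def ls(nodes, edges, unary, binary):
    """2-valued encoding of a graph: returns (U, iota).

    iota[p] maps argument tuples to 0 or 1; unary predicates use 1-tuples.
    """
    iota = {}
    for p in unary:
        iota[p] = {(u,): int((u, p, u) in edges) for u in nodes}
    for p in binary:
        iota[p] = {(u1, u2): int((u1, p, u2) in edges)
                   for u1 in nodes for u2 in nodes}
    # No node of a concrete graph is summarised.
    iota["sm"] = {(u,): 0 for u in nodes}
    return set(nodes), iota


nodes = {"r1", "r2", "s1", "s2", "t1", "t2"}
edges = (
    {(n, "RC", n) for n in ("r1", "r2")}
    | {(n, "S", n) for n in ("s1", "s2")}
    | {(n, "T", n) for n in ("t1", "t2")}
    | {("r1", "on", "t1"), ("s1", "next", "t1")}  # hypothetical edges
)
U, iota = ls(nodes, edges, unary=["RC", "S", "T"], binary=["on", "next"])
```

Storing every tuple explicitly is wasteful for large graphs but keeps the correspondence with the definition obvious; a sparse representation would store only the tuples with value 1.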

The basic idea of shape analysis is to represent infinitely many different but
similarly shaped configurations or graphs by one shape graph. A shape graph thus
cannot always give us precise information about the number of nodes, nor can
it give us precise information about edges between nodes. The third truth value
"maybe" ($\frac{1}{2}$) is used to represent this fact in logical structures. A node
representing many concrete nodes is summarised, denoted by a dashed rectangle; an
edge which is only "maybe" there (dashed line) is assigned the truth value $\frac{1}{2}$.
Figure 4 shows a shape graph. Here, we for instance have $RC(r_1) = 1$ ($r_1$ is
definitely of type railcab), $sm(t) = \frac{1}{2}$ (there is possibly more than one track)
and $next(t, t) = \frac{1}{2}$ (the tracks summarised in $t$ may be connected). The
predicate is_colliding will be explained later. This is the start shape graph of our
reachability analysis.

$U = \{r_1, r_2, s_1, s_2, t_1, t_2\}$
$\mathcal{C}^1 = \{RC, T, S, sm\}$, $\mathcal{C}^2 = \{on, next\}$
Table 1: Logical structure of the graph in Figure 1

Fig. 4: Start shape graph

Shape graphs are abstractions of concrete graphs; concrete graphs can be
embedded into them. Clearly, the predicate $sm$ plays a crucial role in embeddings.
Interpretations of this predicate are restricted to the values 0 and $\frac{1}{2}$. If it is $\frac{1}{2}$ for an
individual $u$, this means $u$ may or may not stand for a whole set of nodes. If it
is 0 for $u$, then $u$ is guaranteed to represent a single individual.

We thus obtain the notion of embedding by the following definition. 



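Definition 7. Let $S = (U, \mathcal{P}, \iota)$, $S' = (U', \mathcal{P}, \iota')$ be two logical structures and
$f : U \to U'$ be a surjective function. We say that $f$ embeds $S$ in $S'$ ($S \sqsubseteq^f S'$) iff

\[ \forall k\; \forall p \in \mathcal{P}_k,\; \forall u_1, \ldots, u_k \in U : \quad \iota(p)(u_1, \ldots, u_k) \sqsubseteq \iota'(p)(f(u_1), \ldots, f(u_k)) \tag{1} \]

and

\[ \forall u' \in U' : \quad (|\{u \mid f(u) = u'\}| > 1) \sqsubseteq \iota'(sm)(u') \tag{2} \]

Thus, intuitively, $S \sqsubseteq S'$ means that $S'$ is in some way a "generalisation" of $S$.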
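Checking conditions (1) and (2) for a given candidate function $f$ is straightforward; the hard part in practice is finding $f$. The following Python sketch only performs the check. The structure layout (a pair of a node set and nested dictionaries), the helper names and the small example are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)


def info_leq(a, b):
    """Information order on {0, 1/2, 1}."""
    return a == b or b == HALF


def embeds(f, S, S2, arities):
    """Check conditions (1) and (2) of Definition 7 for a given f : U -> U'.

    `arities` maps every predicate name (including "sm") to its arity.
    """
    (U, iota), (U2, iota2) = S, S2
    if {f[u] for u in U} != set(U2):           # f must be surjective
        return False
    for p, k in arities.items():               # condition (1)
        for us in product(U, repeat=k):
            image = tuple(f[u] for u in us)
            if not info_leq(iota[p][us], iota2[p][image]):
                return False
    for u2 in U2:                               # condition (2)
        many = int(sum(1 for u in U if f[u] == u2) > 1)
        if not info_leq(many, iota2["sm"][(u2,)]):
            return False
    return True


# Two concrete tracks are embedded into a single summary track t.
arities = {"T": 1, "sm": 1, "next": 2}
concrete = ({"t1", "t2"}, {
    "T": {("t1",): 1, ("t2",): 1},
    "sm": {("t1",): 0, ("t2",): 0},
    "next": {("t1", "t1"): 0, ("t1", "t2"): 1, ("t2", "t1"): 0, ("t2", "t2"): 0},
})
shape = ({"t"}, {"T": {("t",): 1}, "sm": {("t",): HALF}, "next": {("t", "t"): HALF}})
not_summary = ({"t"}, {"T": {("t",): 1}, "sm": {("t",): 0}, "next": {("t", "t"): HALF}})
f = {"t1": "t", "t2": "t"}
```

With the summary node the check succeeds; if the target node is marked as non-summarised, condition (2) fails because two concrete nodes are mapped onto it.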



3 Rule Application on Shape Graphs 

The basic idea of shape analysis follows that of abstract interpretation: instead of 
looking at concrete graphs and applying graph transformation rules concretely, 
we look at shape graphs and apply our rules to shapes instead. We thus induc- 
tively compute the set of reachable shape graphs and on these check for forbid- 
den patterns. If the forbidden pattern is absent in this set, it should also not be 
present in any concretely reachable graph. In this section we explain how to
apply rules on shape graphs; the next section looks at the soundness of this
technique.

Rule application on shape graphs involves a number of distinct steps, some of 
which are not present on concrete graphs. The basic difference is that due to the 
"maybe" predicates in shapes, we usually do not find an exact counterpart, i.e. 
an injective matching, for the left hand side. The following steps are necessary: 

Match To find out whether a rule $P$ matches, we evaluate a rule formula $\varphi_P$
    in the logical structure of the shape graph. If it evaluates to $\frac{1}{2}$, the rule can
    potentially be applied.
Focus In order to actually apply a potentially applicable rule, we have to bring
    the left hand side of the rule into focus. We do so by materialising the left
    hand side in the shape graph.
Coerce Materialisation concretises parts of the shape graph. This concretisation
    has an influence on the rest of the shape (e.g., if a railcab is definitely
    on one track, it cannot at the same time be "maybe" on another track).
    Coercing removes "maybe" structures in the shape by inspecting definitely
    known predicates.
Apply After materialisation and coercion the rule can be applied, basically as
    on concrete graphs.



We next go through each of these steps. To define matching, we transform the 
left hand side of a rule into a formula. 



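Definition 8. Let $P = (L, R)$ be a graph production rule. The production
formula $\varphi_P$ corresponding to $P$ is given by

\[ \varphi_P = \underbrace{\bigwedge_{(n,l,n') \in E_L} l(n, n')}_{\text{binary edges}} \land \underbrace{\bigwedge_{(n,l,n) \in E_L} l(n)}_{\text{unary loops}} \land \underbrace{\bigwedge_{n_1, n_2 \in N_L,\; n_1 \neq n_2} \lnot(n_1 = n_2)}_{\text{injectivity}} \land \underbrace{\bigwedge_{n \in N_L} \lnot sm(n)}_{\text{non-summarisation}} \]

The $\neq$ here means the non-equality of the variable symbols, while $=$ is a regular
predicate².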
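To make the matching step concrete, the production formula can be evaluated under a given assignment with the Kleene connectives (conjunction as minimum, negation as $1 - x$). The sketch below is illustrative only. It assumes the convention of [SRW02] that equality of a summary node with itself evaluates to $\frac{1}{2}$; the function name and the example shape are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

HALF = Fraction(1, 2)


def eval_rule_formula(NL, EL, iota, m):
    """Value of the production formula in a 3-valued structure under m."""
    vals = []
    for s, l, t in EL:
        # Loops stand for unary predicates, all other edges for binary ones.
        vals.append(iota[l][(m[s],)] if s == t else iota[l][(m[s], m[t])])
    for n1, n2 in combinations(sorted(NL), 2):
        if m[n1] != m[n2]:
            eq = 0
        else:
            # Same individual: equality is definite only for non-summary nodes.
            eq = 1 - iota["sm"][(m[n1],)]
        vals.append(1 - eq)                      # injectivity conjunct
    for n in NL:
        vals.append(1 - iota["sm"][(m[n],)])     # non-summarisation conjunct
    return min(vals, default=Fraction(1))


# A railcab r1 that is "maybe" on a summary track t.
iota = {
    "RC": {("r1",): 1, ("t",): 0},
    "T": {("r1",): 0, ("t",): 1},
    "sm": {("r1",): 0, ("t",): HALF},
    "on": {("r1", "t"): HALF, ("t", "r1"): 0, ("r1", "r1"): 0, ("t", "t"): 0},
}
NL = {"r", "tk"}
EL = {("r", "RC", "r"), ("tk", "T", "tk"), ("r", "on", "tk")}
potential = eval_rule_formula(NL, EL, iota, {"r": "r1", "tk": "t"})
impossible = eval_rule_formula(NL, EL, iota, {"r": "t", "tk": "t"})
```

The first assignment yields $\frac{1}{2}$ because both the on-edge and the summary status of the track are indefinite; the second yields 0 because the summary track is definitely not a railcab.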



When this formula evaluates to $\frac{1}{2}$ (respectively 1) for a 3-valued structure, we say that
the rule is potentially applicable (respectively applicable) in the associated shape graph. To
actually apply it, we have to bring the rule into focus, i.e. make sure that we
definitely find the left hand side of the rule in the shape. Intuitively, we would
want something like this

\[ focus_P(S) = \{ S' \mid S' \sqsubseteq S \land \exists m : [\![\varphi_P]\!]^{S'}_m = 1 \} \]

meaning all possible graphs which are embeddable in $S$ and to which the rule
can be applied. Unfortunately, this set can be infinitely large, and in fact, this is
exactly what our technique tries to avoid, namely having to construct all concrete
graphs for a shape. Instead, we only compute a set $mat_P(S)$ (the materialisation
with respect to a rule) such that each element in $focus_P(S)$ can be embedded
in at least one element from $mat_P(S)$, but still the rule is applicable in every
shape in $mat_P(S)$.

In order to construct the set $mat_P(S)$, let us now assume that we have a
shape graph $G$, its corresponding logical structure $S = (N_G, \mathcal{L}, \iota)$, a production
rule $P = (L, R)$, and a matching $m : L \to N_G$ which gives rise to an assignment
$m : \mathcal{F}(\varphi_P) \to N_G$ such that $[\![\varphi_P]\!]^S_m \neq 0$. Let $N_G^{sum}$ be the set of summary nodes
in $G$. We have to find exactly the left hand side of the rule in the materialisation.
Thus, every node $u$ in $\Gamma(m) := m(L) \cap N_G^{sum}$ needs to be materialised into as
many nodes as are mapped onto $u$ via $m$. The relationships of these materialised
nodes to other nodes of the shape are inherited from the original shape graph. In
addition, we have to decide whether to keep the summarised node out of which
we have made our materialisation, or to remove it. This represents the idea that
summarised nodes can stand for any number of concrete nodes. Thus we get
several materialisations of one shape graph, one for every set $I \subseteq \Gamma(m)$, $I$ being
those now materialised nodes for which we keep the original summary node.



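Definition 9. Let $G$ be a graph, $S = ls(G) = (U, \mathcal{P}, \iota)$, $P = (L, R)$ be a
production rule and $M = \{ m \mid [\![\varphi_P]\!]^S_m = \frac{1}{2} \}$. Let $\Gamma(m) := m(L) \cap N_G^{sum}$. Then,
for each $m \in M$ and each $I \subseteq \Gamma(m)$ the materialisation of $P$ according to
$(m, I)$ is defined as $mat_m^I(S) = (U', \mathcal{P}, \iota')$, with

\[ U' = (U \setminus (m(N_L) \setminus I)) \cup N_L \]

and for $p \in \mathcal{C}^2$ and $q \in \mathcal{C}^1 \setminus \{sm\}$, letting $\hat{m} = m \cup id_U$:

\[ \iota'(q)(u) = \begin{cases} 1 & \text{if } u \in N_L \land (u, q, u) \in E_L \\ \iota(q)(\hat{m}(u)) & \text{else} \end{cases} \]

\[ \iota'(p)(u_1, u_2) = \begin{cases} 1 & \text{if } u_1, u_2 \in N_L \land (u_1, p, u_2) \in E_L \\ \iota(p)(\hat{m}(u_1), \hat{m}(u_2)) & \text{else} \end{cases} \]

\[ \iota'(sm)(u) = \begin{cases} 0 & \text{if } u \in N_L \\ \iota(sm)(\hat{m}(u)) & \text{else} \end{cases} \]

The collection of all such logical structures is then defined as the materialisation
of $S$ with respect to $P$:

\[ mat_P(S) = \{ S \mid \exists m : [\![\varphi_P]\!]^S_m = 1 \} \;\cup\; \{ mat_m^I(S) \mid [\![\varphi_P]\!]^S_m = \tfrac{1}{2},\; I \subseteq \Gamma(m) \} \]

where the first set covers regular rule application and the second the materialisations.

² Given an assignment $m$, two variables $x_1$ and $x_2$ are considered equal if they are
mapped onto the same node by $m$ and this is not a summary node.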
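A single materialisation $mat_m^I(S)$ can be sketched as follows. This is an illustrative Python sketch of the construction in Definition 9, not the authors' Java implementation: structures are nested dictionaries, the parameter `keep` plays the role of $I$, and the node names of the rule are assumed to be disjoint from those of the shape.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def materialise(U, iota, NL, EL, m, keep, unary, binary):
    """Build the materialisation of (U, iota) for matching m and kept set I."""
    assert not (set(NL) & set(U)), "rule node names are assumed to be fresh"
    matched = {m[n] for n in NL}
    U2 = (set(U) - (matched - set(keep))) | set(NL)
    m_hat = {u: u for u in U}
    m_hat.update(m)                       # m extended by the identity on U
    iota2 = {"sm": {(u,): 0 if u in NL else iota["sm"][(m_hat[u],)]
                    for u in U2}}
    for q in unary:
        iota2[q] = {(u,): 1 if (u, q, u) in EL else iota[q][(m_hat[u],)]
                    for u in U2}
    for p in binary:
        iota2[p] = {(u1, u2): 1 if (u1, p, u2) in EL
                    else iota[p][(m_hat[u1], m_hat[u2])]
                    for u1 in U2 for u2 in U2}
    return U2, iota2


# Shape: railcab r1 "maybe" on a summary track t whose tracks may be connected.
U = {"r1", "t"}
iota = {
    "RC": {("r1",): 1, ("t",): 0},
    "T": {("r1",): 0, ("t",): 1},
    "sm": {("r1",): 0, ("t",): HALF},
    "on": {("r1", "t"): HALF, ("t", "r1"): 0, ("r1", "r1"): 0, ("t", "t"): 0},
    "next": {("r1", "t"): 0, ("t", "r1"): 0, ("r1", "r1"): 0, ("t", "t"): HALF},
}
NL = {"r", "tk"}
EL = {("r", "RC", "r"), ("tk", "T", "tk"), ("r", "on", "tk")}
m = {"r": "r1", "tk": "t"}
U_keep, iota_keep = materialise(U, iota, NL, EL, m, {"t"}, ["RC", "T"], ["on", "next"])
U_drop, _ = materialise(U, iota, NL, EL, m, set(), ["RC", "T"], ["on", "next"])
```

In the variant that keeps the summary node, the railcab is now definitely on the materialised track `tk` while it is still "maybe" on the remaining summary track; removing such leftovers is the job of the coercion step.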

Note that the number of materialisations $mat_m^I(S)$ can be exponential in the number of nodes in the
left hand side of the rule, but is finite. The following theorem states that it is
indeed sufficient to consider $mat_P(S)$ instead of $focus_P(S)$.

Theorem 1. Let $S$ be a 3-valued logical structure and $P$ a production rule. Then

\[ mat_P(S) \subseteq focus_P(S) \quad \text{and} \tag{3} \]
\[ focus_P(S) \sqsubseteq mat_P(S) \tag{4} \]

Due to lack of space we have to omit all proofs. They can be found in [SWW10].
Fig. 5 shows the result of applying materialisation on the starting shape graph
using the EnterStation production rule.

The next step is coercion. After materialisation we apply the coerce operation
defined in [SRW02] on the resulting shape graphs. Doing so serves two purposes:
On the one hand we can identify inconsistencies in the shape graph (e.g. an
empty track with a railcab on it). On the other hand we can "sharpen" some
predicate values of the shape graph in some cases. The latter can be found, for
example, when looking at the materialised shape graphs of Fig. 5. There, the
empty predicate has the value $\frac{1}{2}$ for node $t$. Yet, the railcab $r$ is definitely on it.
Hence, we can sharpen the predicate value of empty to 0.

The semantic knowledge needed to perform the coercion step comes from
the so-called compatibility constraints. Compatibility constraints may be either
hand-written formulae (e.g. $\exists r : on(r, t) \Rightarrow \lnot empty(t)$) or may be formulae that
are derived from the so-called meaning formulae of the instrumentation
predicates (see below for a discussion of instrumentation predicates and their meaning
formulae). We do not explain coercion in detail here; for this see [SWW10].

Finally, we can now apply the production rule. Since the left hand side of the
rule is, due to materialisation, explicitly present in the shape graph, this follows
the standard procedure. There is, however, one peculiarity related to the analysis.
To make the analysis more precise, we introduce special instrumentation
predicates to our shape graphs. Consider again our forbidden collision
pattern of the last section. To see whether this is present, we could evaluate the
formula

\[ \varphi_{forbidden\_collision} := on(r_1, t) \land on(r_2, t) \land T(t) \land RC(r_1) \land RC(r_2) \land \lnot(r_1 = r_2) \land \lnot sm(r_1) \land \lnot sm(r_2) \land \lnot sm(t) \]

Unfortunately, for most shape graphs we find an assignment $m$ such that this
formula evaluates to $\frac{1}{2}$ under $m$, since we have lost information about the precise
position of railcabs on tracks. This holds in particular also in our start shape
graph. To regain this information, we introduce an extra instrumentation predicate for this
property: is_colliding. In our start shape graph for the reachability analysis
this predicate is 0 for all nodes (see Fig. 4, where the label is_colliding is not
connected to any node; we thus start the analysis with a shape in which no
two railcabs are on the same track). Every concretisation of a shape graph with
instrumentation predicates $p$ has to obey its so-called meaning formula $\alpha_p$. For
example, the instrumentation predicate is_colliding has the following attached
meaning formula:

\[ \alpha_{is\_colliding}(v) := T(v) \land \exists r_1, r_2 : (r_1 \neq r_2 \land on(r_1, v) \land on(r_2, v)) \]

Now, for a concrete graph $G$ embedded in a shape $S$ with node $v$ mapped to $u$
via the embedding function, we have to check that the evaluation of the meaning
formula with respect to $v$ yields the same or a more precise value (with respect to the information
order) than the instrumentation predicate value in $u$. Instrumentation predicates
are (obviously) not part of our production rules. Therefore, we have to explicitly
specify how these predicates change on rule application. For this purpose, we
specify update formulae.
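The consistency requirement between a concrete graph and the instrumentation predicate of a shape can be illustrated as follows. This Python sketch evaluates the meaning formula of is_colliding on a concrete graph given as a set of edge triples and compares the result with a shape value in the information order; the function names and the example graph are illustrative assumptions.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def alpha_is_colliding(v, edges):
    """Meaning formula of is_colliding, evaluated on a concrete graph."""
    if (v, "T", v) not in edges:
        return 0                                  # v is not a track
    on_v = {s for s, l, t in edges if l == "on" and t == v}
    return int(len(on_v) >= 2)                    # two distinct nodes on v


def consistent(concrete_value, shape_value):
    """The concrete value must equal the shape value or the latter is 1/2."""
    return concrete_value == shape_value or shape_value == HALF


edges = {
    ("t1", "T", "t1"), ("r1", "RC", "r1"), ("r2", "RC", "r2"),
    ("r1", "on", "t1"), ("r2", "on", "t1"),
}
value = alpha_is_colliding("t1", edges)
```

Here the track `t1` carries two railcabs, so the formula evaluates to 1. The graph can therefore only be embedded into shapes whose corresponding node has is_colliding equal to 1 or $\frac{1}{2}$, never 0.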

Definition 10. A shape production rule $P = ((L, R), \gamma)$ consists of a graph
production rule $(L, R)$ and a function $\gamma$ mapping each instrumentation
predicate $p \in \mathcal{I}$ and each node $v \in N_R$ to a first-order predicate-update formula $\varphi_{p,v}$
with free variables in $N_L$.

The predicate-update formula $\varphi_{p,v}$ specifies how the value of the instrumentation
predicate $p$ should be calculated for each $v \in N_R$ of the new shape graph with
respect to the predicate values of the old shape graph. For example, we could
attach the following update formulae to the production rule EnterStation:

\[ \varphi_{is\_colliding,r} = 0, \quad \varphi_{is\_colliding,s} = 0 \]

\[ \varphi_{is\_colliding,t} = is\_colliding(t) \land \exists r_2, r_3 : ((r_2 \neq r) \land (r_3 \neq r) \land (r_3 \neq r_2) \land on(r_2, t) \land on(r_3, t)) \]

Note that we make use of a free variable called $r$ in the formula $\varphi_{is\_colliding,t}$.
When the production rule is applied to a shape graph $S$, this free variable gets
assigned to the individual in $S$ that represents the $r$ node of the left hand side
of the production rule. The following definition formalises the shape production
application.

Definition 11. Let $P = ((L, R), \gamma)$ be a shape production rule and $S = (U, \iota)$
be a shape graph. The rule $P$ can be applied to $S$ if we find an injective
function $m : N_L \to U$ (again called a matching) such that for all $(n, p, n') \in E_L$:
$\iota(p)(m(n), m(n')) = 1$ (or $\iota(p)(m(n)) = 1$ for $p \in \mathcal{C}^1$).

If $m$ is a matching, then the application of $P$ onto $S$ with respect to the matching
$m$ is the structure $S' = (U', \iota')$ with $U' = (U \setminus m(N^-)) \cup N^+$ and $\iota'$ defined as
follows for $p \in \mathcal{C}^2$, $o \in \mathcal{C}^1 \setminus \{sm\}$, $q \in \mathcal{I}$, $\hat{m} = m \cup id_{N^+}$, and $u, u_1, u_2 \in U'$:

\[ \iota'(p)(u_1, u_2) = \begin{cases} 0 & \text{if } (u_1, p, u_2) \in m(E^-) \\ 1 & \text{if } (u_1, p, u_2) \in \hat{m}(E^+) \\ \iota(p)(u_1, u_2) & \text{else} \end{cases} \]

\[ \iota'(o)(u) = \begin{cases} 0 & \text{if } (u, o, u) \in m(E^-) \\ 1 & \text{if } (u, o, u) \in \hat{m}(E^+) \\ \iota(o)(u) & \text{else} \end{cases} \]

\[ \iota'(sm)(u) = \begin{cases} 0 & \text{if } u \in N^+ \\ \iota(sm)(u) & \text{else} \end{cases} \]

\[ \iota'(q)(u) = \begin{cases} [\![\gamma(q, m^{-1}(u))]\!]^S_m & \text{if } u \in m(N_L) \\ [\![\gamma(q, u)]\!]^S_m & \text{if } u \in N^+ \\ \iota(q)(u) & \text{else} \end{cases} \]

We write $S \xrightarrow{P,m} S'$ if $S'$ is the result of applying $P$ with matching $m$ to $S$.

We also use the notation $\to$ to include all steps of materialisation, coercion and
rule application, i.e. we write $S \to S'$ if $S$ can be materialised into $S_1$ with respect to a rule
$P$, then coerced into $S_2$, $P$ applied giving $S_3$ and finally coerced into $S'$. Figure 6
shows the result of applying EnterStation on the coerced versions of the shapes
of Fig. 5.





Fig. 6: Applying rule EnterStation on coerced shapes. 



4 Soundness of Technique 



Using the methods of the previous section, we can now define the set of reachable
shapes of a shape graph transformation system $ST$, where $ST$ consists of a start
shape graph $S_0$ and a set of shape production rules $(P_i, \gamma_i)_{i \in I}$:

\[ reach(ST) = max(\{ S \mid S_0 \to^* S \}) \]

Here, $max$ is defined for a set of shape graphs $XS$ as in [SRW02]:

\[ max(XS) := XS \setminus \{ X \mid \exists X' \in XS : X \sqsubseteq X' \land X' \not\sqsubseteq X \} \]
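The $max$ operation only depends on the embedding order, so it can be sketched generically. In the following Python sketch the order is passed in as a function `embeds_in`; the example uses set inclusion as a stand-in order purely for illustration, since a real embedding test between shape graphs is far more involved.

```python
def maximal(shapes, embeds_in):
    """Remove every shape that is strictly below another one in the order.

    embeds_in(x, y) is expected to return True iff x can be embedded in y.
    """
    return [
        x for x in shapes
        if not any(embeds_in(x, y) and not embeds_in(y, x) for y in shapes)
    ]


# Stand-in order for illustration: plain set inclusion.
result = maximal([{1}, {1, 2}, {3}], lambda a, b: a <= b)
```

Shapes that are mutually embeddable are both kept by this definition, exactly as in the formula above, since neither is strictly below the other.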



The set of reachable shape graphs can be inductively constructed: we start with 
the initial shape graph and then successively apply the production rules. For 



each newly produced shape graph we check whether it can be embedded into or 
covers an already existing shape graph. Shape graphs that are covered by others 
are discarded. The following theorem states that this algorithm is sound, i.e. we 
do not miss any of the reachable graphs: 

Theorem 2. Let $GT = (G_0, (P_i)_{i \in I})$ be a graph transformation system and
$ST = (S_0, (P_i, \gamma_i)_{i \in I})$ an associated shape transformation system with $G_0 \sqsubseteq S_0$. Then

\[ reach(GT) \subseteq \{ G \mid G \text{ 2-valued} \land G \sqsubseteq S \land S \in reach(ST) \}. \]

Note that due to lack of space we have left out some extra conditions here 
referring to coercion and the compatibility constraints used therein. The full 
theorem and the proof can be found in [SWW10] . 

At the end, we have to check for forbidden patterns in the shape graphs. A
shape graph $S$ contains a forbidden pattern $(F, F)$ ($S \models F$) if (1) there is an
assignment $m$ such that $[\![\varphi_F]\!]^S_m \neq 0$, i.e. the pattern is (potentially) present in
the shape, and (2) materialisation and coercion give us at least one valid
concretisation, i.e. $coerce(mat_P(S)) \neq \emptyset$ for $P = (F, F)$. If a forbidden pattern is not contained in
a shape graph, then it is also not contained in embedded concrete graphs.

Theorem 3. Let $S$ be a shape graph, $(F, F)$ a forbidden pattern, $G$ a graph
such that $G \sqsubseteq S$. Then $S \not\models F \Rightarrow G \not\models F$.

In summary this shows soundness of our technique: all reachable graphs are 
embedded in reachable shape graphs, and if we are able to show absence of 
forbidden patterns in the shapes this also holds for the concrete graphs. Note that 
due to the overapproximation the reverse is in general not true: we might find 
forbidden patterns in the shapes although none of the concrete graphs contain 
them. Instrumentation predicates are used to reduce such situations. 

Finally, a note on termination. If the algorithm is carried out as proposed 
above, it might not terminate although we only consider maximal shapes. This 
could occur if the production rules generate shapes which are all incomparable 
in the embedding order. To avoid this, one can introduce another abstraction 
step in the algorithm: Nodes which agree on all unary predicate valuations are 
collapsed into one. As we can only have finitely many combinations of predicate 
valuations this gives us finitely many different shape graphs. 
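The additional abstraction step can be sketched as follows. The text above only states that nodes agreeing on all unary predicate valuations are collapsed; how the predicate values of the merged node are obtained is an assumption of this sketch, which follows the canonical abstraction of [SRW02]: values that agree are kept, values that disagree become $\frac{1}{2}$, and a node that stands for several nodes is marked as summarised. For simplicity, the list `unary` is assumed not to contain `sm`, and all names are illustrative.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)


def join(values):
    """Least upper bound in the information order."""
    vs = set(values)
    return vs.pop() if len(vs) == 1 else HALF


def collapse(U, iota, unary, binary):
    """Merge all nodes that agree on every predicate listed in `unary`."""
    classes = {}
    for u in U:
        key = tuple(iota[q][(u,)] for q in unary)
        classes.setdefault(key, set()).add(u)
    U2 = {frozenset(c) for c in classes.values()}
    iota2 = {q: {(c,): join(iota[q][(u,)] for u in c) for c in U2}
             for q in unary}
    iota2["sm"] = {(c,): HALF if len(c) > 1
                   else join(iota["sm"][(u,)] for u in c)
                   for c in U2}
    for p in binary:
        iota2[p] = {(c1, c2): join(iota[p][(u1, u2)]
                                   for u1, u2 in product(c1, c2))
                    for c1 in U2 for c2 in U2}
    return U2, iota2


U = {"t1", "t2", "r1"}
iota = {
    "RC": {("t1",): 0, ("t2",): 0, ("r1",): 1},
    "T": {("t1",): 1, ("t2",): 1, ("r1",): 0},
    "sm": {(u,): 0 for u in U},
    "next": {(a, b): int((a, b) == ("t1", "t2")) for a in U for b in U},
}
U2, iota2 = collapse(U, iota, unary=["RC", "T"], binary=["next"])
tracks, cab = frozenset({"t1", "t2"}), frozenset({"r1"})
```

Since the number of distinct unary valuations is bounded by $3^{|\mathcal{C}^1 \cup \mathcal{I}|}$, the number of nodes per collapsed shape is bounded as well, which is what yields finitely many shape graphs.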



5 Implementation 

We implemented the verification algorithm in Java, making use of the source
code of the shape analysis tool TVLA [BLARS07]. Thus we were able to take
advantage of the already optimised code for logical structures provided by TVLA.



Basically, our implementation loads a starting shape graph, a set of shape
production rules, and a set of forbidden patterns, each represented as a text file.
Additionally, one needs to supply a text file listing the set of core and
instrumentation predicates, the latter with their meaning formulae. The implementation
then successively constructs the set of reachable shape graphs, each represented
as a logical structure, and checks whether a newly found shape graph contains one
of the forbidden patterns. If the shape graph does contain a forbidden pattern,
a counterexample is generated that describes how the shape graph was
constructed as a sequence of production applications. Otherwise, the shape graph is
added to the set of reachable shape graphs and the maximum operation is
applied. If no new shape graphs can be found anymore and none of the reachable
shape graphs contains a forbidden pattern, the implementation asserts that the
given shape transformation system (STS) is safe.
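The overall loop described above can be summarised in a short worklist sketch. This Python code is not the Java implementation built on TVLA; it is a schematic rendering under the assumption that successor computation (materialise, coerce, apply, coerce), the embedding test and the forbidden-pattern check are supplied as functions. The toy instantiation with integers exists only to make the sketch executable.

```python
def explore(start, successors, embeds_in, is_forbidden):
    """Worklist construction of the maximised set of reachable shapes.

    Returns (safe, reached); `safe` is False as soon as a forbidden
    pattern is found in a newly produced shape.
    """
    reached, work = [], [start]
    while work:
        s = work.pop()
        if is_forbidden(s):
            return False, reached
        if any(embeds_in(s, r) for r in reached):
            continue                      # s is covered by a known shape
        # Discard shapes covered by s, then record s (the max operation).
        reached = [r for r in reached if not embeds_in(r, s)] + [s]
        work.extend(successors(s))
    return True, reached


# Toy instantiation: "shapes" are integers ordered by <=.
def step(s):
    return [s + 1] if s < 3 else []


safe, reached = explore(0, step, lambda a, b: a <= b, lambda s: s == 5)
unsafe, _ = explore(0, step, lambda a, b: a <= b, lambda s: s == 2)
```

A full implementation would additionally record, for each shape, the production application that produced it, so that a counterexample trace can be reported when a forbidden pattern is found.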

We tested our implementation using the running example on a 3 GHz
Intel Core2Duo Windows system with 3 GB of main memory. Our implementation
needs about 250 ms to verify that the running example STS is safe, i.e. that no
collision happens. While doing so, it temporarily constructs 108 intermediate logical
structures and finds 17 logical structures in the maximised set of reachable shape
graphs.

This and further case studies show that the most expensive operation in terms
of runtime is the max operation. We implemented it by checking for each newly
found shape graph whether it can be embedded in a shape graph in the
(current) set of reachable shape graphs or vice versa. Thus, for each newly found
shape graph we need $2n$ embedding checks, if $n$ denotes the number of shape
graphs in the current set of reachable shape graphs. Furthermore, for arbitrary
shape graphs checking for embedding is NP-complete [AMSS06]. Hence it is
not surprising that the max operation was observed to be very costly.

6 Conclusion and Related Work 

In this paper, we have introduced a shape analysis approach for generating a
finite over-approximation of the reach set of a graph transformation system with
infinite state space. In contrast to some of the other work done in this area,
e.g. [Ren04], we derive from our strict adherence to the formalism presented
in [SRW02] a very straightforward avenue for implementation, which we have
demonstrated using the 3-valued logic engine TVLA. In order to emphasise the
qualities of our approach, we will now discuss how it relates to other work in
this area.

In [BBKR08], a method for automatic abstraction of graphs is introduced.
Intuitively, nodes are identified if their neighbourhoods of radius $k \in \mathbb{N}$ are the same.
While this automatic abstraction greatly reduces the need for human
intervention in the verification process, it also reduces the flexibility of the approach.
Only a certain class of systems can be handled well by neighbourhood
abstraction, while our approach can be tuned to fit the needs of very different systems
on a per-system basis. Furthermore, the method from [BBKR08] cannot use
information from spurious counterexamples, since the abstraction leaves it with
only one degree of freedom, the radius $k$. In contrast, using additional
instrumentation predicates, our approach can utilise the full amount of information
from spurious counterexamples.

Another approach to verifying infinite-state systems is the one by Saksena,
Wibling and Jonsson [SWJ08]. It is based on backwards application of rules.
By applying inverted rules to the forbidden patterns it is possible to determine 
whether a starting graph can lead to a failure state. The backwards application 
paradigm imposes some restrictions on this approach, for example forbidding 
the deletion of nodes and requiring a single starting pattern. Our approach does 
not suffer such restrictions. Furthermore, since the approach does not include 
an explicit abstraction and thus no information about the rest of the graph 
is available when applying a rule to a pattern, it would be very difficult to 
include concepts such as parameterised rules or parallel rule application. Since 
our approach uses explicit abstraction through shapes, it can encode information 
about the entire graph and is thus much more suited to support such extensions. 

Lastly, Baldan, Corradini and König [BCK08] have written a series of papers
in which they develop a unique approach to the verification of infinite-state
GTSs. They relate GTSs to Petri nets and construct a combined formalism,
called a Petri graph, on which they show certain properties via a technique called
unfolding. This approach achieves many of the goals we strive for. However, a
single concrete start graph is required for an analysis, which would be a major 
restriction in systems where there are many possible initial states, or even an 
unknown initial state. 

The above discussion of related work is by no means exhaustive, but it suffices
to show that, while each of these approaches currently has some advantages
over our approach, no single approach outperforms ours in every respect.
The results described in this paper lay the foundations for a new approach to
the verification of infinite-state GTSs, which we strongly believe to be better
suited to overcome the many problems facing any theory in this area than the
currently available approaches. Nevertheless, there are a number of limitations to our
approach which we intend to tackle in the future. We plan to look at parallel
rule application, negative application conditions [HHT96] and especially rules
with quantifiers [Ren06], which allow changes on arbitrary numbers of
nodes of some particular type to be specified within one rule.